File Organization and Indexing 

The data of a RDB is ultimately stored in disk files 

Disk space management: 

Should Operating System services be used ? 
Should RDBMS manage the disk space by itself ? 

2 nd option is preferred as RDBMS requires complete 
control over when a block or page in main memory buffer 
is written to the disk. 

This is important for recovering data when system 
crash occurs 
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Structure of Disks 
Disk 

■ several platters stacked on 
a rotating spindle 

■ one read / write head per surface 
for fast access 

■ platter has several tracks 

• -10,000 per inch 

■ each track - several sectors 

■ each sector - blocks 

■ unit of data transfer - block 

■ cylinder i - track i on all platters 



Platters 




track 



Speed: 
7000 to 
10000 rpm 



< ► 



Read/ write head 




sector 
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Data Transfer from Disk 

Address of a block: Surface No, Cylinder No, Block No 

Data transfer: 

Move the r/w head to the appropriate track 

• time needed - seek time — 12 to 14 ms 

Wait for the appropriate block to come under r/w head 

• time needed - rotational delay — 3 to 4ms (avg) 

Access time: Seek time + rotational delay 

Blocks on the same cylinder - roughly close to each other 

- access time-wise 
- cylinder i, cylinder (i + 1), cylinder (i + 2) etc. 
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Data Records and Files 

Fixed length record type: each field is of fixed length 

• in a file of these type of records, the record number can be 
used to locate a specific record 

• the number of records, the length of each field are available 
in file header 

Variable length record type: 

• arise due to missing fields, repeating fields, variable length 
fields 

• special separator symbols are used to indicate the field 
boundaries and record boundaries 

• the number of records, the separator symbols used are 
recorded in the file header 
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Packing Records into Blocks 



Record length much less than block size 

• The usual case 

• Blocking factor b = [_B/rJ B - block size (bytes) 

r - record length (bytes) 
- maximum no. of records that can be stored in a block 



Record length greater than block size 
• spanned organization is used 




File blocks: 

sequence of blocks containing all the records of the file 



Prof P Sreenivasa Kumar 
Department of CS&E, IITM 



5 



Mapping File Blocks onto the Disk Blocks 

Contiguous allocation 

• Consecutive file blocks are stored in consecutive disk blocks 

• Pros: File scanning can be done fast using double buffering 
Cons: Expanding the file by including a new block in the middle 

of the sequence - difficult 

Linked allocation 

• each file block is assigned to some disk block 

• each disk block has a pointer to next block of the sequence 

• file expansion is easy; but scanning is slow 

Mixed allocation 
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Operations on Files 

Insertion of a new record: may involve searching for appropriate 
location for the new record 

Deletion of a record: locating a record -may involve search; 
delete the record -may involve movement of other records 

Update a record field/fields: equivalent to delete and insert 

Search for a record: given value of a key field / non-key field 

Range search: given range values for a key / non-key field 

How successfully we can carry out these operations 
depends on the organization of the file and the availability 
of indexes 
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Primary File Organization 

The logical policy / method used for placing records into file blocks 

Example: Student file - organized to have students records sorted 
in increasing order of the "rollNo" values 

Goal: To ensure that operations performed frequently on the file 
execute fast 

• conflicting demands may be there 

• example: on student file, access based on rollNo and also 

access based on name may both be frequent 

• we choose to make rollNo access fast 

• For making name access fast, additional access structures 
are needed. 

- more details later 
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Different File Organization Methods 

We will discuss Heap files, Sorted files and Hashed files 
Heap file: 

Records are appended to the file as they are inserted 
Simplest organization 

Insertion - Read the last file block, append the record and 

write back the block - easy 
Locating a record given values for any attribute 

• requires scanning the entire file - very costly 

Heap files are often used only along with other access structures. 
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Sorted files / Sequential files (1/2) 

Ordering field: The field whose values are used for sorting the 

records in the data file 
Ordering key field: An ordering field that is also a key 

Sorted file / Sequential file: 
Data file whose records are arranged such that the values of the 
ordering field are in ascending order 

Locating a record given the value X of the ordering field: 
Binary search can be performed 

Address of the n th file block can be obtained from 
the file header 

0(log N) disk accesses to get the required block- efficient 
Range search is also efficient 



Prof P Sreenivasa Kumar 
Department of CS&E, IITM 



10 



Sorted files / Sequential files (2/2) 

Inserting a new record: 

■ Ordering gets affected 

• costly as all blocks following the block in which insertion is 
performed may have to be modified 

■ Hence not done directly in the file 

• all inserted records are kept in an auxiliary file 

• periodically file is reorganized - auxiliary file and main file 

are merged 

• locating record 

• carried out first on auxiliary file and then the main file. 

Deleting a record 

• deletion markers are used. 
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Hashed Files 

Very useful file organization, if quick access to the data record is 
needed given the value of a single attribute. 

Hashing field: The attribute on which quick access is needed and 

on which hashing is performed 

Data file: organized as a buckets with numbers 0,1, . . ., (M - 1) 
(bucket - a block or a few consecutive blocks) 

Hash function h: maps the values from the domain of the hashing 

attribute to bucket numbers 
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Inserting Records into a Hashed File 



Insertion : for the given record R, 
apply h on the value of hashing 
attribute to get the bucket number r. 

If there is space in bucket r, 
place R there else place R in the 
overflow chain of bucket r. 

The overflow chains of all the 
buckets are maintained in the 
overflow buckets. 



o 



Q 



M-l 




)verflow 
chain 



Overflow 
buckets 



Main buckets 
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Deleting Records from a Hashed File 



Deletion: Locate the record R to be 
deleted by applying h. 

Remove R from its bucket/overflow 
chain. If possible, bring a record from x 
the overflow chain into the bucket 

Search: Given the hash filed value 2 

k, compute r = h(k). Get the bucket 

r and search for the record. If not 

found, search the overflow chain 

of bucket r. m-i 



o 





>verflow 
chain 




Overflow 
buckets 



Main buckets 
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Performance of Static Hashing 

Static hashing: 

■ The hashing method discussed so far 

■ The number of main buckets is fixed 

Locating a record given the value of the hashing attribute 
most often - one block access 

Capacity of the hash file C = r * M records 
(r - no. of records per bucket, M - no. of main buckets) 

Disadvantage with static hashing: 
If actual records in the file is much less than C 

• wastage of disk space 

If actual records in the file is much more than C 

• long overflow chains - degraded performance 
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Hashing for Dynamic File Organization 

Dynamic files 

■ files where record insertions and deletion take place frequently 

■ the file keeps growing and also shrinking 

Hashing for dynamic file organization 

■ Bucket numbers are integers 

■ The binary representation of bucket numbers 

■ Exploited cleverly to devise dynamic hashing schemes 

■ Two schemes 

• Extendible hashing 

• Linear hashing 
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Extendible Hashing (1/2) 

The fc-bit sequence corresponding to a record R: 

Apply hashing function to the value of the hashing field of R 
to get the bucket number r 

Convert r into its binary representation to get the bit sequence 
Take the trailing k bits 
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Extendible Hashing (2/2) 



The # of 
trailing 
bits used in 
the directory 



'Global depth d=3 



000 




001 




010 




Oil 




100 




101 




110 




111 





Directory 



Local depth 



The number of bits in the 
common suffix of bit 
sequences corresponding to 
the records in the bucket 



All records with 2-bit 
Sequence '01' 



All records with 3 -bit 
Sequence '111' 



Locating a record 

Match the d-bit sequence with an entry in the directory and go to 
the corresponding bucket to find the record 
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Insertion in Extendible Hashing Scheme (1/2) 



2 - bit sequence for the record to be inserted: 00 

i 

full 



00 




01 




10 




11 






00 




01 




10 




11 






d=2 



d=2 



b 0 Full: 



bi 



all local 
depth = 2 



Bucket b 0 is split 
All records whose 2-bit sequence is '10' are 
sent to a new bucket b 3 . Others are retained in b 0 
Directory is modified. 
b 0 Not full: New record is placed in b 0 . No changes in the directory. 
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Insertion in Extendible Hashing Scheme (2/2) 

2 - bit sequence for the record to be inserted: 10 , 




b 3 not full: new record placed in b 3 . No changes. 

b 3 full : b 3 is split, directory is doubled, all records with 3 -bit 

sequence 110 sent to b 4 . Others in b 3 . 
In general, if the local depth of the bucket to be split is equal to the 

global depth, directory is doubled 
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Deletion in Extendible Hashing Scheme 




Matching pair of data buckets: 

k-bit sequences have a common k-1 bit suffix, e.g, b 3 & b 4 
Due to deletions, if a pair of matching data buckets 

— become less than half full - try to merge them into one bucket 
If the local depth of all buckets is one less than the global depth 

— reduce the directory to half its size 
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Extendible Hashing Example 

Bucket capacity - 2 Initial buckets = 1 
Insert 45,22 



Global 
depth 

Insert 12 



Insert 1 1 



o 

I 



0 




1 





0 




1 





22 



12 




22 



12 



45 
11 





0 




45 




22 






1 





Local depth 



45 


101101 


22 


10110 


12 


1100 


11 


1011 



Bucket overflows 
local depth = global depth 
=> Directory doubles and split image 

is created 
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Insert 15 



Insert 10 





2 


00 




01 




10 




11 
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» 


45 






00 
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01 




10 




• 10 


11 






22 




Overflow occurs. 

Global depth = local depth 

Directory doubles and split occurs 



45 


101101 


22 


10110 


12 


1100 


11 


1011 


15 


1111 


10 


1010 



Overflows occurs. 
Since local depth < global depth 
Split image is created 
Directory is not doubled 
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Linear Hashing 

Does not require a separate directory structure 

Uses a family of hash functions h 0 , h 1? h 2 ,. . .. 

• the range of 1^ is double the range of h iA 

• h^x) = x mod 2*M 

M - the initial no. of buckets 

(Assume that the hashing field is an integer) 

Initial hash functions 
h Q (x) = x mod M 
h^x) = x mod 2M 
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Insertion (1/3) 

Initially the structure has M main buckets 
( 0 , . . . , M- 1 ) and a few overflow buckets 



To insert a record with hash field value x, 
place the record in bucket h 0 (x) 

When the first overflow in any bucket occurs: 
Say, overflow occurred in bucket s 
Insert the record in the overflow chain of bucket s 
Create a new bucket M — 
Split the bucket 0 by using h x 

Some records stay in bucket 0 and 

some go to bucket M. 
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0 



Overflow 
buckets 




Split image 
of bucket 0 



Insertion (2/3) 

On first overflow, 0 
irrespective of where it occurs, bucket 0 is split 

On subsequent overflows 
buckets 1, 2, 3, ... are split in that order 
(This why the scheme is called linear hashing) 2 

N: the next bucket to be split 

After M overflows, 
all the original M buckets are split. m-i 
We switch to hash functions h 1? h 2 
and set N = 0. 



h 



o 



hi 



hi 

* ho 



hi 
h 



M 



M+1 



Split 
images 



i+1 
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Nature of Hash Functions 

h^x) = x mod 2*M. Let M f = TM 

■ Note that if h(x) = k then x = M f r + k, k < M f 
and h +1 (x) = (M f r + k) mod 2M f = k or M f + k 

Since, 

r - even - (M f 2s + k) mod 2M f = k 

r - odd - ( M f (2s + 1 ) + k ) mod 2M f = M f + k 

M f - the current number of original buckets. 
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Insertion (3/3) 

Say the hash functions in use are h i? h i+1 
To insert record with hash field value x, 
Compute h^x) 

if h^x) < N, the original bucket is already split 

place the record in bucket h i+1 (x) 
else place the record in bucket h^x) 
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Linear Hashing Example 



Initial Buckets = 1 Bucket capacity = 2 records 



Hash functions 
h 0 = x mod 1 
h , = x mod 2 



N 



Split pointer 



0 



Insert 12, 1 1 



N 



0 



11 



Insert 14 



B 0 overflows 
Bucket pointed by 

N is split 
Hash functions are 
changed 



0 



N 



14 



11 



h 0 = x mod 2 
h 1 = x mod 4 
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Insert 13 

N 



0 



12 



14 



11 



13 



Insert 10 



applied here 
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B 0 is split using h 1 
and split image 
is created 
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N 0 



Insert 18 



overflow at B. 

split B l 
h 0 = x mod 4 
hj = x mod 8 
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x mod 2 
h j = x mod 4 
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Index Structures 

Index: A disk data structure 

- enables efficient retrieval of a record 

given the value (s) of certain attributes 
- indexing attributes 

Primary Index: 

Index built on ordering key field of a file 

Clustering Index: 

Index built on ordering non-key field of a file 

Secondary Index: 

Index built on any non-ordering field of a file 
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Primary Index 

Can be built on ordered / sorted files 
Index attribute - ordering key field (OKF) 



Index Entry: 



value of OKF for 


disk address 


the first record of 


of Bj 


a block B| 



Index file: ordered file (sorted on OKF) 

size-no. of blocks in the data file 
Index file blocking factor B¥ { = |_B/(V +P)_ 
(B-block size, V-OKF size, P-block pointer size) 
- generally more than data file blocking factor 
No of Index file blocks b { = b/BF 
(b - no. of data file blocks) 



101 




121 




129 




• 
• 
• 
• 




240 






Ordering key 
(RollNo) 



101 




104 









w 121 




123 









129 




130 









240 




244 









Data 
file 
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Record Access Using Primary Index 

Given Ordering key field (OKF) value: x 
Carry out binary search on the index file 

m - value of OKF for the first record in the middle block k of 
the index file 

x < m: do binary search on blocks 0 - {k -1) of index file 
x>m: if there is an index entry in block k with OKF value x, 

use the corresponding block pointer, 

get the data file block and 

search for the data record with OKF value x 
else do binary search on blocks k+l 9 ... 9 b { of index file 

Maximum block accesses required: n°g2 b[ ~\ 
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An Example 



Data file: 

No. ofblocksb = 9500 

Block size B = 4KB 

OKF length V= 15 bytes 

Block pointer length p = 6 bytes 
Index file 

No. of records v { = 9500 

Size of entry V + P = 21 bytes 

Blocking factor BF { = [_4096/2 lj = 195 

No. of blocks bj =[r i /BF i ] = 49 
Max No. of block accesses for getting record 

using the primary index 
Max No. of block accesses for getting record 
without using primary index 



1 + log 2 b i = 7 
log 2 b l= 14 
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Making the Index Multi-level 

Index file - itself an ordered file 

- another level of index can be built 
Multilevel Index - 

Successive levels of indices are built till the last level has one block 



height - no. of levels 
block accesses: height + 1 
(no binary search required) 



9500 
entries 




49 entries 




For the example data file: 
No of block accesses required with 
multi-level primary index: 3 
without any index: 14 



Second level 

index 

1 block 



First level 
index 
49 blocks 




data file 
9500 blocks 
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Range Search, Insertion and Deletion 

Range search on the ordering key field: 

Get records with OKF value between x x and x 2 (inclusive) 
Use the index to locate the record with OKF value x x and read 
succeeding records till OKF value exceeds x 2 . 
Very efficient 

Insertion: Data file - keep 25% of space in each block free 

— to take care of future insertions 

index doesn't get changed 

— or use overflow chains for blocks that overflow 

Deletion: Handle using deletion markers so that index doesn't get 
affected 

Basically, avoid changes to index 
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Clustering Index 

Built on ordered files where ordering field is not a key 
Index attribute: ordering field (OF) 



Index entry: 



Distinct value Vj 


address of the first 


of the OF 


block that has a record with OF value V i 



Index file: Ordered file (sorted on OF) 
size - no. of distinct values of OF 
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Secondary Index 

Built on any non-ordering field (NOF) of a data file. 
Case I: NOF is also a key (Secondary key) 



value of the NOF V ; 



pointer to the record with V { as the NOF value 



Case II: NOF is not a key: two options 

(i) 



value of the NOF V. 


pointer(s) to the record(s) with V i as the NOF value 




value of the NOF V; 


pointer to a block that has pointer(s) to the record(s) 
with Vj as the NOF value 



(2) 



Remarks: 

(1) index entry - variable length record 

(2) index entry - fixed length - One more level of indirection 
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Secondary Index (key) 

Can be built on ordered and also other type of files 
Index attribute: non-ordering key field 
Index entry: 



value of the NOF V ; 



pointer to the record with V i as the NOF value 



Index file: ordered file (sorted on NOF values) 
No. of entries - same as the no. of records in the data file 



Index file blocking factor = B/(V+P r ) 
(B: block size, V: length of the NOF, 
P r : length of a record pointer) 



Index file blocks = |~r/Bf-l 

(r - no. of records in the data file) 
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An Example 



Data file: 

No. of records r = 90,000 Block size B = 4KB 
Record length R = 100 bytes BF =[4096/100 J = 40, 

b 490000/40]= 2250 



NOF length V= 15 bytes 
Index file : 

No. of records r = 90,000 



length of a record pointer P r = 7 bytes 



BF.= 



4096/22 



record length = V + P r = 22 bytes 



= 186 



No. of blocks b = 



90000/186 



= 484 



Max no. of block accesses to get a record 

using the secondary index 
Avg no. of block accesses to get a record 

without using the secondary index 
A very significant improvement 



1 + 



log 2 b i 



= 10 



b/2 = 1125 
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Multi-level Secondary Indexes 

Secondary indexes can also be converted to multi-level indexes 

First level index 

- as many entries as there are records in the data file 

First level index is an ordered file 

so, in the second level index, the number of entries will be 
equal to the number of blocks in the first level index 
rather than the number of records 

Similarly in other higher levels 
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Making the Secondary Index Multi-level 

Multilevel Index - 
Successive levels of indices are built 

till the last level has one block 
height - no. of levels 
block accesses: height + 1 



3 entries 484 entries 



data file 

90000 

records 



90000 
entries 




1 block 




For the example data file: 
No of block accesses required: 
multi-level index: 4 
single level index: 10 



Second level 

index 

3 blocks 




First level 
index 
484 blocks 



2250 
blocks 
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Index Sequential Access Method (ISAM) Files 

ISAM files - 

Ordered files with a multilevel primary/clustering index 
Insertions: 

Handled using overflow chains at data file blocks 

Deletions: 

Handled using deletion markers 

Most suitable for files that are relatively static 

If the files are dynamic, we need to go for dynamic multi-level 
index structures based on B + - trees 
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B - trees 

■ Balanced search trees 

• all leaves are at the same level 

■ Leaf node entries point to the actual data records 

• all leaf nodes are linked up as a list 

■ Internal node entries carry only index information 

■ In B-trees, internal nodes carry data records also 

■ The fan-out in B-trees is less 

■ Makes sure that blocks are always at least half filled 

■ Supports both random and sequential access of records 
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Order 

Order (m) of an Internal Node 

• Order of an internal node is the maximum number of tree 
pointers held in it. 

• Maximum of (m-1) keys can be present in an internal node 

Order (m leaf ) of a Leaf Node 

• Order of a leaf node is the maximum number of record 
pointers held in it. It is equal to the number of keys in a 
leaf node. 
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Internal Nodes 

An internal node of a B + - tree of order m: 

[ml 

■ It contains at least y pointers, except when it is the root node 

■ It contains at most m pointers. 

■ If it has P 1? P 2 , . . Pj pointers with 

Kj < K 2 < K 3 . . . < Kj^ as keys, where |~^~| < j < m, then 

• Pj points to the subtree with records having key value x < K 

• P { (1 < i < j) points to the subtree with records having 

key value x such that < x < K { 

• P; points to records with key value x > K- x 
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Internal Node Structure 

rm~| 

lyl - J- m 



p, 




?2 


K 2 


• • • 




Pi 




• • • 






• • • 




x< 




K,,<x< K ; 




Kj., < x 



Example 



1 



12 



x< 2 



2<x< 5 



5<x< 12 



x> 12 
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Leaf Node Structure 



Structure of leaf node of B + - of order m 



leaf 



It contains one block pointer P to point to next leaf node 



At least 



m 



leaf 



record pointers and 



m 



leaf 



key values 



At most m leaf record pointers and key values 

If a node has keys K 2 < K 2 < . . . < Kj with Pr 1? Pr 2 . . . Prj as record 

pointers and P as block pointer, then 

Fv { points to record with K { as the search field value, 1 < i < j 
P points to next leaf block 







K 2 


Pr 2 


• • • 






• • • 


P 
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Order Calculation 

Block size: B, Size of Indexing field: V 

Size of block pointer: P, Size of record pointer: P r 

Order of Internal node (m): 

As there can be at most m block pointers and (m-1) keys 

(m*P) + ((m-l)* V)<B 
m can be calculated by solving the above equation. 

Order of leaf node: 
As there can be at most m leaf record pointers and keys 
with one block pointer in a leaf node, 
m leaf can be calculated by solving 
(m leaf * (P r + V)) + P < B 
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Example Order Calculation 

Given B = 512 bytes V = 8 bytes 

P = 6 bytes P r = 7 bytes. Then 

Internal node order m = ? 

m*P + ((m-l)*V)<B 
m* 6 + ((m-l) *8)<512 
14m < 520 
m<37 

Leaf order m leaf = ? 

mieaf(Pr + V) + P<512 

m leaf (7 + 8) + 6<512 
15m leaf <506 

m leaf ^ 33 
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Example B + - tree 



m = 3 m leaf = 2 



7 





4 






/ 











9 






/ 









1 


2 






3 








4 










6 


7 






8 


9 






12 


15 


A 
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Insertion into B + - trees 



1 . Every node is inserted at leaf level 



If leaf node overflows 
• Node is split at j = 



then 

K eaf + 1) 



• First j entries are kept in original node 

• Entities from j+1 are moved to new node 

• j th key value is replicated in the parent of the leaf. 
If Internal node overflows 



Node is split at j = 



(m + 1) 



L 2 J 



• Values and pointers up to Pj are kept in original node 

• j th key value is moved to parent of the internal node 

• Pj +1 to the rest of entries are moved to new node. 
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Example of Insertions 



m = 3 m leaf= 2 

Insert 20, 1 1 




l 



n 



20 



A 



Insert 14 



Overflow, leaf is split 



atj = 



( m ieaf + 1) 



= 2 



11 



14 is replicated to upper level 



14 



14 



A 




20 



A 



Insert 25 



Inserted at 
leaf level 




/ 



14 



A 



Insert 30 



n 



14 



20 



25 



A 



Overflow, 
split at 25. 
25 is moved 
up 



14 




11 


14 






20 


25 






30 
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Insert 12 



Overflow at leaf level. 

- Split at leaf level, 

- Triggers overflow at internal node 

- Split occurs at internal node 







12 






A 


/ 















25 






A 


/ 




\ 








Internal node split 

-j - rti 

split at 14 and 14 is 
moved up 



11 


12 






14 










20 


25 






30 
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Deletion in B - trees 

■ Delete the entry from the leaf node 

■ Delete the entry if it is present in Internal node and replace with 
the entry to its left in that position. 

■ If underflow occurs after deletion 

• Distribute the entries from left sibling 
if not possible - Distribute the entries from right sibling 
if not possible - Merge the node with left and right sibling 
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Example 



12 






11 




12 








14 








20 




22 








23 




24 








25 








30 

































































Delete 20 



14 



Removed entry 
from leaf here 



12 



\ 





22 




/ 







25 





11 




12 








14 








22 








23 




24 








25 








30 
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Delete 22 




Delete 24 





12 




/ 










11 




12 








14 








23 








24 








25 








30 





























































Entry 22 is removed 
from leaf and 
internal node 
Entries from right 
sibling are 
distributed to left 



11 


12 






14 






23 






25 






30 

































Prof P Sreenivasa Kumar 
Department of CS&E, IITM 



58 




Delete 12 




Level drop has occurred 



11 


23 






25 






30 
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Advantages of B - trees: 

1) Any record can be fetched in equal number of disk accesses. 

2) Range queries can be performed easily as leaves are linked up 

3) Height of the tree is less as only keys are used for indexing 

4) Supports both random and sequential access. 

Disadvantages of B + - trees: 

Insert and delete operations are complicated 
Root node becomes a hotspot 
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